Double-Talk and Path Change Detection Using A Matrix of Correlation Coefficients 
Background Of The Invention 

1. Field of Invention 

[0001] This invention relates to a method of detecting double-talk and path changes in 
echo cancellation systems. Echo cancellation is used extensively in telecommunications
applications to recondition a wide variety of signals, such as speech, data transmission, 
and video. 

2. Description of Related Art

[0002] The search for an effective echo cancellation procedure has produced several
different approaches with varying degrees of complexity, cost, and performance. A
traditional approach to echo cancellation uses an adaptive filter of length L, where L
equals the number of samples necessary to extend to just beyond the duration of the echo.
Typically, the adaptive filters contain either 512 or 1024 taps. At the standard telephone
sampling rate of 8000 samples per second, this provides the ability to adapt to echo paths
as long as 64 ms and 128 ms, respectively.
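
The relationship between tap count and echo path coverage quoted above is simple arithmetic. The following is a minimal sketch; the function name `echo_span_ms` is an illustrative assumption and is not part of the described method.

```python
SAMPLE_RATE_HZ = 8000  # telephone sampling rate cited in the text


def echo_span_ms(num_taps, sample_rate_hz=SAMPLE_RATE_HZ):
    """Longest echo path, in milliseconds, that a filter with num_taps taps can span."""
    return 1000.0 * num_taps / sample_rate_hz


for taps in (512, 1024):
    print(taps, "taps ->", echo_span_ms(taps), "ms")
```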

[0003] The computational requirements of an adaptive filter are proportional to L for the
popular LMS (Least Mean Squares) class of algorithms, and proportional to $L^2$ or higher
for algorithms such as RLS (Recursive Least Squares). More robust algorithms (like RLS)
have greatly improved convergence characteristics over LMS methods, but the $L^2$
computational load makes them impractical with current technology. For this reason, the
LMS algorithm (and its variants) tends to remain the algorithm of choice for echo
cancellation.

[0004] Practical echo cancellation devices must provide some means of avoiding
divergence from double-talk. The double-talk condition arises when there is simultaneous
transmission of signals from both sides of the echo canceller due to the presence of
near-end speech in addition to the echo. Under such circumstances, the return echo path signal,
Sin (see Figure 1), contains both return echo from the echo source signal, and a
double-talk signal. The presence of a double-talk signal will prevent an LMS-based echo
canceller from converging on the correct echo path. It will also cause a pre-converged
echo canceller to diverge to unpredictable states. Following divergence, the echo canceller
will no longer cancel the echo, and must reconverge to the correct solution. Such
behaviour is highly unacceptable, and is to be avoided in actual devices. Some means of
detecting double-talk must therefore be implemented. To prevent divergence, the LMS
filter coefficients are typically frozen during the presence of double-talk.

[0005] Detecting double-talk quickly and reliably is a notoriously difficult problem. Even 
a small amount of divergence in a fully converged LMS filter will result in a significant 
increase in the residual echo level. The use of a fast and reliable double-talk detector is 
crucial to maintain adequate subjective performance. 

[0006] The simplest, and perhaps most common, method for detecting double-talk is to
use signal levels. The echo path typically contains a minimum amount of loss, or
reduction, in the return signal. This quantity is often referred to as the Echo Return Loss,
or ERL. In most systems, this is assumed to be at least 6 dB. In other words, the return
signal Sin will be at a level which is at least 6 dB lower than Rout provided that there is
no double-talk. In the presence of double-talk, the level at Sin often increases so that it is
no longer 6 dB lower than Rout. This condition provides a simple and convenient test for
double-talk.

[0007] The problem with this approach is that the double-talk detector must have an 
accurate estimate of the echo path ERL in order to determine if the level at Sin is too high. 
However, precise knowledge of the ERL is generally not available. If the ERL estimate is
too high, the double-talk detector may trigger unnecessarily. Conversely, it may not 
trigger at all if the ERL estimate is too low. 

[0008] Another problem with this technique is that it will only reliably detect high-level 
double-talk. If the double-talk signal is at a much lower level than the echo source signal, 
low-level double-talk occurs. Under this condition, the increase in the level of Sin is
usually very small. The double-talk detector may fail to trigger, but noticeable divergence 
in the LMS filter can still occur. 

[0009] To detect low-level double-talk, the level of the residual echo signal (Sout) is
often monitored. If no double-talk or background noise is present, and the LMS filter is
fully converged, Sout can be as much as 40 dB lower than Rout. Assuming that the echo
path remains constant, any increase in Sout will likely be due to double-talk. Of course, if
the echo path does change, it will be mistaken for double-talk. So if this method is used, a
separate path change detection algorithm must be employed. A unified approach would be
simpler and preferred.

[0010] Correlation is a statistical function which is commonly used in signal processing.
It can provide a measure of the similarity between two signals (cross-correlation), or between a
single signal and time-shifted versions of itself (autocorrelation). The use of correlation
for double-talk detection per se is known. Several patents exist for correlation-based
double-talk detection, including US5646990, US5526347 and US5193112. The
correlation-based approaches taken in prior-art methods generally involve the calculation
of a single cross-correlation coefficient, usually between Rin and Sin. The problem with
this technique is that the degree of correlation can vary widely with different signals and
echo paths. This makes it very difficult to set thresholds on the correlation coefficient in
order to determine what state the echo canceller is in.

Summary Of The Invention

[0011] A process has been developed which generates matrix coefficients using zero-lag
auto- and cross-correlations from signals commonly found in echo cancellers. Double-talk
and path changes are then detected using matrix operations such as determinants,
eigendecompositions, or singular value decompositions (SVDs).

[0012] The correlations between various signals in an echo canceller will change
depending on what state the echo canceller is in, i.e. if it is converged, unconverged, or in
double-talk. By arranging the various correlations in appropriate matrix form, key
information about the state of the echo canceller can be extracted by performing various
matrix operations. The preferred operation is to take the determinant, but
eigendecompositions and singular value decompositions (SVDs) can also be used. A
novel aspect of the invention is the formulation of a matrix using various correlation
coefficients, and the subsequent analysis of this matrix to determine the state of the echo
canceller.

[0013] Accordingly the present invention provides a method of detecting double-talk and
path changes in an echo cancellation system, comprising generating a correlation-based
matrix of signals in said echo cancellation system; and analyzing said correlation-based
matrix to identify double-talk and path changes occurring in said system.

[0014] In the preferred embodiment, the correlation-based matrix is generated using the 

return echo signal (Sin) and the output of an LMS adaptive filter. 
[0015] In the invention, the correlation-based matrix is generated using zero-lag auto-
and cross-correlations of signals commonly found in echo cancellers.

[0016] Double-talk and path changes are detected by analysis of the correlation-based 

matrix. Possible analysis techniques include condition numbers, determinants, 

eigendecompositions, and singular value decompositions. 
[0017] In the preferred embodiment, determinants are used to detect double-talk and path
changes.

[0018] The invention can be implemented using either the time-domain or frequency- 
domain in a digital signal processor using conventional digital signal processing 
techniques. 

[0019] The invention also provides a double-talk and path change detector, comprising a
processing element generating a correlation-based matrix of signals in said echo 
cancellation system; and a processing element for analyzing said correlation-based matrix 
to identify double-talk and path changes occurring in said system. 

Brief Description Of The Drawings

[0020] The invention will now be described in more detail, by way of example only, with
reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram of an echo canceller using LMS Adaptive Filtering;
Figure 2a is a plot showing the value of det [R] under normal convergence;
Figure 2b is a plot showing the value of det [R] with a path change; and
Figure 2c is a plot showing the value of det [R] with double-talk.

Detailed Description Of The Invention

[0021] The layout of a typical LMS-based echo canceller is shown in Figure 1. It contains
two signals, travelling along a "send" path and a "receive" path. The echo source signal
enters the echo canceller as Rin and leaves as Rout. The send path input, Sin, consists of a
double-talk signal (if present) plus the echo source signal after it has travelled along the
echo path. By estimating the echo path, a synthetic echo signal can be generated to cancel
the echo in the send path. The echo cancelled signal leaves as Sout.

[0022] The LMS filter attempts to cancel the echo by adjusting itself to suppress the 
output signal at Sout. Obviously, if Sin contains components other than echoed speech 
from the echo source, the LMS filter will not converge to the correct solution; hence the 
need for double-talk detection. 

[0023] The preferred embodiment of the algorithm for this patent uses the Normalized-LMS
(N-LMS) algorithm. Mathematically, the adaptive filter tap-weight update procedure
for the N-LMS algorithm consists of the following three equations:

$$\hat{d}[n] = \mathbf{w}^H[n]\,\mathbf{u}[n]$$

$$e[n] = d[n] - \hat{d}[n]$$

$$\mathbf{w}[n+1] = \mathbf{w}[n] + \frac{\mu}{a + \|\mathbf{u}[n]\|^2}\,\mathbf{u}[n]\,e[n]$$

[0024] where

$\mathbf{u}[n]$ = Rin = echo source signal
$\mathbf{w}[n]$ = LMS filter coefficients
$d[n]$ = Sin = desired LMS output (echo + double-talk)
$\hat{d}[n]$ = LMS output (estimated echo)
$e[n]$ = Sout = LMS error signal
$\mu$ = LMS step-size parameter
$a$ = a small constant (provides numerical stability).
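
The three update equations can be sketched in a few lines of code. The following is a minimal illustration for real-valued signals only; the 4-tap echo path, the white-noise input, the step size of 0.5 and the function names are all assumptions chosen for the example, not values taken from the specification.

```python
import random


def nlms_step(w, u, d, mu=0.5, a=1e-6):
    """One N-LMS update for real-valued signals.

    w : filter coefficients w[n]
    u : most recent input samples, newest first (the vector u[n])
    d : desired sample d[n] (Sin)
    Returns (w_next, d_hat, e).
    """
    d_hat = sum(wi * ui for wi, ui in zip(w, u))   # d_hat[n] = w^H[n] u[n]
    e = d - d_hat                                  # e[n] = d[n] - d_hat[n]
    norm = a + sum(ui * ui for ui in u)            # a + ||u[n]||^2
    w_next = [wi + (mu / norm) * ui * e for wi, ui in zip(w, u)]
    return w_next, d_hat, e


def run_demo(num_samples=4000, seed=0):
    """Adapt to a hypothetical 4-tap echo path with no double-talk present."""
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.2, 0.1]              # hypothetical echo path
    w = [0.0] * len(echo_path)
    u = [0.0] * len(echo_path)
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0)] + u[:-1]         # shift in a new Rin sample
        d = sum(h * x for h, x in zip(echo_path, u))  # Sin = echo only
        w, _, _ = nlms_step(w, u, d)
    return echo_path, w


echo_path, w = run_demo()
print([round(x, 4) for x in w])
```

In this noise-free setting the coefficients settle on the echo path; adding a second signal to `d` would model double-talk and disturb the solution.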



[0025] The location of these signals in the echo canceller is also shown in Figure 1. The
N-LMS algorithm is well known to persons skilled in the art and a more detailed treatment
is readily available in most adaptive filtering texts. See, for example, S. Haykin, Adaptive
Filter Theory, Prentice-Hall, Upper Saddle River, NJ (1996), the contents of which are
herein incorporated by reference.

[0026] One of the key parameters in the N-LMS algorithm is the LMS step-size parameter
$\mu$. This parameter controls both the speed and accuracy of convergence. The larger $\mu$ is,
the faster the algorithm will converge on the echo path, but the less accurate the
steady-state solution will be. To guarantee convergence of the N-LMS algorithm, $\mu$ must be less
than 2.

[0027] A common technique is to adjust the value of $\mu$ based on the state of the echo
canceller. In an unconverged state (such as at start-up, or following a path change), it is
desirable to use a large value for $\mu$ to permit rapid initial convergence. Once the LMS
filter has achieved a reasonable degree of convergence, $\mu$ can be reduced. This not only
allows for a slightly more accurate solution (and therefore more cancellation), but it will
also slow potential divergence from double-talk. To stop adaptation altogether, $\mu$ can
simply be set to zero. The double-talk and path change detectors can therefore control the
operation of the LMS filter by varying the value of $\mu$.

[0028] A double-talk detection algorithm in accordance with a preferred embodiment of
the invention, designed to work in conjunction with the echo canceller
illustrated in Figure 1, will now be described. It is implemented in a digital signal processor.

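[0029] Consider two signals, $X_0[n]$ and $X_1[n]$, generated by a linear combination of two
real-valued source signals, $S_0[n]$ and $S_1[n]$. Mathematically, this mixing process may be
described as

$$X_i = H_{i,0}\,S_0 + H_{i,1}\,S_1,$$

[0030] where $H_{i,j}$ are the mixing coefficients. In matrix form, this may be written as

$$\mathbf{X} = \mathbf{H}\,\mathbf{S}$$

[0031] where

$$\mathbf{X} = \begin{bmatrix} X_0[n] \\ X_1[n] \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} H_{0,0} & H_{0,1} \\ H_{1,0} & H_{1,1} \end{bmatrix}, \quad \mathbf{S} = \begin{bmatrix} S_0[n] \\ S_1[n] \end{bmatrix}.$$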




[0032] A matrix R is defined as

$$\mathbf{R} = E\left[\mathbf{X}\mathbf{X}^T\right]$$

[0033] where $E[\ldots]$ is the statistical expectation operator. R may be expanded in two
ways:

$$\mathbf{R} = E\begin{bmatrix} X_0^2[n] & X_0[n]\,X_1[n] \\ X_1[n]\,X_0[n] & X_1^2[n] \end{bmatrix}$$

$$\mathbf{R} = E\left[\mathbf{H}\mathbf{S}\mathbf{S}^T\mathbf{H}^T\right]$$

[0034] From the first expansion, it is apparent that the diagonal terms in R are the
zero-lag autocorrelations of $X_0[n]$ and $X_1[n]$, and that both off-diagonal terms correspond to the
zero-lag cross-correlation between $X_0[n]$ and $X_1[n]$. Hence, R is a symmetric,
correlation-based matrix.

[0035] From the second expansion, we see that if H is full-rank, then R will also be
full-rank if $S_0[n]$ and $S_1[n]$ are both non-zero and uncorrelated. In most cases, a sufficient
condition for this is that $S_0[n]$ and $S_1[n]$ are different signals from different sources.

[0036] The way in which the matrix can be used to perform double-talk and path change
detection will now be explained. First, suppose we generate the signal mixtures using
convolutions:

$$\mathbf{X} = \mathbf{H} \otimes \mathbf{S}$$

[0037] Now the terms in the mixing matrix can be vectors. We further impose the
condition that H have the following form:

$$\mathbf{H} = \begin{bmatrix} H_{0,0} & 1 \\ H_{1,0} & 0 \end{bmatrix}$$
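
The rank argument can be checked numerically. The following is a minimal sketch that replaces the convolutional terms with scalar stand-ins, so that the mixing matrix has first row (H00, 1) and second row (H10, 0); the coefficient values, the white-noise sources and the helper names are assumptions made for illustration. With two uncorrelated unit-power sources, det R should be close to (det H)^2, and with only one source present R should be singular.

```python
import random


def correlation_matrix(x0, x1):
    """Sample estimate of R = E[X X^T] from zero-lag auto- and cross-correlations."""
    n = len(x0)
    r00 = sum(a * a for a in x0) / n
    r01 = sum(a * b for a, b in zip(x0, x1)) / n
    r11 = sum(b * b for b in x1) / n
    return [[r00, r01], [r01, r11]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


rng = random.Random(1)
N = 20000
s0 = [rng.gauss(0.0, 1.0) for _ in range(N)]   # first source, unit power
s1 = [rng.gauss(0.0, 1.0) for _ in range(N)]   # second source, uncorrelated with s0

H00, H10 = 0.8, 0.5                            # hypothetical scalar mixing coefficients

# Two sources present: X0 = H00*S0 + S1, X1 = H10*S0
x0 = [H00 * a + b for a, b in zip(s0, s1)]
x1 = [H10 * a for a in s0]
R = correlation_matrix(x0, x1)

# Only one source present: X0 and X1 are scaled copies of S0
x0_single = [H00 * a for a in s0]
R_single = correlation_matrix(x0_single, x1)

print("two sources: det R =", det2(R))          # near (det H)^2 = 0.25
print("one source:  det R =", det2(R_single))   # near zero (singular)
```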




[0038] With H defined in this way, it is now possible to connect the terms in the
preceding equations with the parameters available in the echo canceller layout shown in
Figure 1. Let

$S_0$ = echo source signal = Rin = $\mathbf{u}[n]$
$S_1$ = double-talk signal
$H_{0,0}$ = echo path
$H_{1,0}$ = LMS filter coefficients = $\mathbf{w}[n]$

[0039] With these definitions, it is apparent that

$$X_0 = H_{0,0} \otimes S_0 + S_1 = S_{\mathrm{in}} = d[n]$$

$$X_1 = H_{1,0} \otimes S_0 = \hat{d}[n]$$

[0040] The question of what happens to R under the various states of echo canceller 
operation will now be examined. 

[0041] Case 1: Unconverged, no double-talk

[0042] If the LMS filter is in an unconverged state, $H_{0,0} \neq H_{1,0}$. This situation occurs
when the echo canceller is first started, or following a major echo path change. Since the
LMS filter does not contain an accurate echo path estimate, $X_0 \neq X_1$, and R will be full
rank (unless $H_{1,0} = 0$, but this condition is usually temporary) with a very low condition
number ($\kappa \sim 10^1$). See, for example, G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd
ed., Johns Hopkins University Press, Baltimore, MD (1996). As convergence
proceeds, the degree of correlation between $X_0$ and $X_1$ increases. This has the effect
of rapidly raising the condition number of R. As a result, the determinant of R will fall,
and its eigenvalues and singular values will become increasingly disparate.

[0043] Case 2: Converged, no double-talk 

[0044] In this state $H_{0,0} \approx H_{1,0}$, so $X_0 \approx X_1$. This will make R very nearly rank deficient,
and its condition number very large ($\kappa \sim 10^6$). Since R is close to being singular, its
determinant will become very small. Similarly, we would expect to find only one
significant eigenvalue or singular value.



[0045] Case 3: Double-talk 

[0046] When double-talk is occurring, $X_0$ contains components from both $S_0$ and $S_1$,
while $X_1$ is derived solely from $S_0$. In this case, $X_0$ and $X_1$ are much less correlated
than in the converged state.
R will have a low condition number, and this will be sustained for the duration of the
double-talk. The higher the double-talk level, the lower the condition number becomes.
This will raise the determinant of R, and we will find two significant eigenvalues and
singular values.
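
The three cases can be reproduced in a small simulation. The following is a minimal sketch, assuming white-noise stand-ins for the echo source and double-talk signals, a hypothetical 4-tap echo path, and fixed (non-adapting) filter coefficients for each case; none of these values come from the specification, and the absolute levels of det R depend on the signal levels chosen.

```python
import math
import random


def fir(h, x):
    """Filter signal x with FIR taps h (output has the same length as x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def corr_matrix(x0, x1):
    """Sample estimate of the 2x2 zero-lag correlation matrix R = E[X X^T]."""
    n = len(x0)
    r00 = sum(a * a for a in x0) / n
    r01 = sum(a * b for a, b in zip(x0, x1)) / n
    r11 = sum(b * b for b in x1) / n
    return [[r00, r01], [r01, r11]]


def det2(r):
    return r[0][0] * r[1][1] - r[0][1] * r[1][0]


def cond2(r):
    """Condition number of a symmetric positive-definite 2x2 matrix (eigenvalue ratio)."""
    half_trace = 0.5 * (r[0][0] + r[1][1])
    det = det2(r)
    lam_max = half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))
    lam_min = det / lam_max   # avoids cancellation when R is nearly singular
    return lam_max / lam_min


rng = random.Random(7)
N = 5000
s0 = [rng.gauss(0.0, 1.0) for _ in range(N)]   # echo source (Rin), white-noise stand-in
s1 = [rng.gauss(0.0, 1.0) for _ in range(N)]   # double-talk signal stand-in

echo_path = [0.5, -0.3, 0.2, 0.1]              # hypothetical H00
w_unconverged = [0.1, 0.4, -0.2, 0.3]          # hypothetical poor estimate of H00
w_converged = [h + 1e-3 * (-1) ** k for k, h in enumerate(echo_path)]  # small residual error

echo = fir(echo_path, s0)
cases = {
    "unconverged": corr_matrix(echo, fir(w_unconverged, s0)),
    "converged": corr_matrix(echo, fir(w_converged, s0)),
    "double-talk": corr_matrix([e + t for e, t in zip(echo, s1)],
                               fir(w_converged, s0)),
}
for name, r in cases.items():
    print(f"{name:12s} det(R) = {det2(r):.3e}   cond(R) = {cond2(r):.3e}")
```

With these particular levels, the converged case gives a determinant several orders of magnitude below the other two and a correspondingly large condition number, which is the behaviour described in Cases 1 to 3.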

[0047] Once the matrix R is generated, a variety of operations are available to determine
what state the echo canceller is in. The condition number, determinant, eigenvalues and
singular values of R can all be used to test for double-talk or path changes. The determinant
is used in the preferred embodiment because it is the simplest matrix operation to
perform.
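
For the 2x2 case the determinant can be written out directly from the three zero-lag correlations. The identity below is a worked illustration; the symbol $\rho$ (the zero-lag correlation coefficient between $X_0$ and $X_1$) is shorthand introduced here and is not part of the original notation.

```latex
\det \mathbf{R}
  = E[X_0^2]\,E[X_1^2] - \left(E[X_0 X_1]\right)^2
  = E[X_0^2]\,E[X_1^2]\,\left(1 - \rho^2\right),
\qquad
\rho = \frac{E[X_0 X_1]}{\sqrt{E[X_0^2]\,E[X_1^2]}}
```

As $|\rho|$ approaches 1 (the converged state, where $X_0$ and $X_1$ are nearly identical) the determinant approaches zero, while any component of $X_0$ that is uncorrelated with $X_1$ lowers $|\rho|$ and raises the determinant. Only two multiplications and one subtraction are needed once the correlations are available.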

[0048] To illustrate the effectiveness of this algorithm at detecting double-talk and path
changes, simulations were carried out and the results are shown in Figures 2a to 2c. The plots
indicate the value of det [R] under normal convergence, a path change, and double-talk.
The scaling of the y-axis on the plots clearly demonstrates the variations observed in det
[R] under the three different states. The simulations were carried out using ITU CSS
synthetic speech signals from the G.168 Digital Echo Canceller standard (ITU-T
Recommendation G.168, Digital Echo Cancellers). The signals were 48000 samples long,
and a 60 ms echo path was used (which was changed to 15 ms during the path change
simulation).

[0049] Under normal convergence (Fig. 2a), det [R] rapidly decays to near-zero values.
When a path change occurs (Fig. 2b), det [R] spikes to a large value and then decays (to
emphasize this trend, convergence was slowed by a factor of 10 following the path
change). With double-talk (Fig. 2c), even larger, but sustained, spikes are present in
det [R]. The differences in these three plots make it very easy to tell what state the echo
canceller is in simply by checking the level of det [R]. The highest levels indicate
double-talk, medium levels (along with decay) occur with path changes, and very low levels are
characteristic of full convergence. Based on these results, thresholds can be set as follows:

- Normal (converged) operation.
- Path change detected.
- Double-talk detected.

[0050] Once the state of the echo canceller is determined, the LMS filter operation can be 
adjusted accordingly. 
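
The decision logic of the preceding paragraphs can be sketched as a two-threshold test on det [R] followed by a step-size choice. The threshold values, the step sizes and the function names below are hypothetical and would need to be tuned for a real system; the sketch also ignores the decay behaviour that distinguishes a path change over time.

```python
def classify_state(det_r, path_change_threshold, double_talk_threshold):
    """Map a det[R] value to an echo canceller state using two thresholds.

    Both thresholds are hypothetical tuning parameters, not values from the text.
    """
    if det_r >= double_talk_threshold:
        return "double-talk"
    if det_r >= path_change_threshold:
        return "path-change"
    return "converged"


def step_size_for(state, mu_fast=1.0, mu_slow=0.25):
    """Choose the N-LMS step size for a state (mu values are illustrative assumptions)."""
    if state == "double-talk":
        return 0.0        # freeze adaptation to prevent divergence
    if state == "path-change":
        return mu_fast    # re-converge quickly on the new echo path
    return mu_slow        # converged: smaller step for a more accurate solution


for det_r in (1e-6, 0.05, 0.8):
    state = classify_state(det_r, path_change_threshold=1e-3,
                           double_talk_threshold=0.3)
    print(det_r, state, step_size_for(state))
```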

[0051] A well-known relation in signal processing is that the convolution of two signals 
in time is equivalent to the multiplication of their frequency spectra. This property makes 
it possible to propose a variation on the preceding algorithm in which frequency-domain 
versions of the signals are used. X has been defined in the time-domain using 
convolutions: 

[0052] $\mathbf{X}[n] = \mathbf{H}[n] \otimes \mathbf{S}[n]$

[0053] By taking the Fourier Transform of all terms involved, it is possible to rewrite the
above equation in the frequency-domain as

[0054] $\mathbf{X}(f_k) = \mathbf{H}(f_k)\,\mathbf{S}(f_k)$

[0055] for all frequencies in the range $0 < f_k < F_s/2$, where $F_s$ is the sampling frequency
of the signals. The generation and analysis of the correlation-based matrix R is carried out
as before, only now R is created using the frequency-domain version of X.
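
The convolution property that underlies the frequency-domain variant can be verified on a short example. The following is a minimal sketch using a direct (non-fast) discrete Fourier transform and zero-padding so that the product of spectra corresponds to linear convolution; the signal values and helper names are assumptions for illustration, and the sketch does not show how R itself is formed from the frequency-domain signals.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a sequence (O(n^2), for illustration)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def linear_convolve(h, s):
    """Time-domain linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(s) - 1)
    for i, hi in enumerate(h):
        for j, sj in enumerate(s):
            out[i + j] += hi * sj
    return out


def zero_pad(v, n):
    """Extend v with zeros to length n."""
    return list(v) + [0.0] * (n - len(v))


h = [0.5, -0.3, 0.2]                 # hypothetical filter taps
s = [1.0, 2.0, 0.0, -1.0, 0.5]       # hypothetical source samples

x = linear_convolve(h, s)            # X[n] = H[n] (*) S[n]
n = len(x)
X = dft(x)                           # X(f_k)
HS = [a * b for a, b in zip(dft(zero_pad(h, n)), dft(zero_pad(s, n)))]  # H(f_k) S(f_k)

max_err = max(abs(a - b) for a, b in zip(X, HS))
print("largest difference between X(f_k) and H(f_k)S(f_k):", max_err)
```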

[0056] The advantage to this approach is that the algorithm no longer needs to have an 
accurate echo path estimate for R to have a high condition number during non-double-talk 
states. The double-talk detector becomes completely insensitive to path changes. 
Depending on the application, this may or may not be a desirable property. Low-level 
double-talk detection abilities improve, but a separate path change detection scheme must 
now be used. 

[0057] Implementation of a frequency-domain version of this process can be 
accomplished through the use of Fast Fourier Transforms (FFTs) or subbanding 
techniques.

[0058] As will be understood by persons skilled in the art, the inventive process can be
implemented in a digital signal processor or other suitable digital signal processing
device.

[0059] Glossary

Adaptive Filter: A filter whose coefficients can be adjusted during operation. Adaptive 
filters are used to estimate unknown parameters, for example an unknown echo path. 

Autocorrelation: A statistical quantity which roughly measures the similarity of a signal 
to time shifted versions of itself. 

Condition Number: A measure of how close a matrix is to being singular. The condition
number for an arbitrary matrix A is given by $\kappa(A) = \|A\|\,\|A^{-1}\|$.

Convergence: The condition achieved when the LMS filter has accurately modelled the 
echo path and is no longer undergoing significant changes. At convergence, the LMS 
filter is cancelling the maximum amount of echo. 

Cross-Correlation: A statistical quantity which roughly measures the similarity of two
separate signals. 

Divergence: The process by which the LMS filter coefficients move away from the actual 
echo path to erroneous and unpredictable solutions. During divergence, the amount of 
echo being cancelled becomes less and less. 

Double-Talk: The condition which occurs during simultaneous transmission of signals
from both sides of the echo canceller. 

Echo Path: A mathematical description of the process which imparts an echo to a signal. 

ERL: Echo Return Loss. The loss a signal experiences as it travels along the echo path 
from Rout to Sin.

ERLE: Echo Return Loss Enhancement. A common method of measuring the
performance of an echo canceller. This measurement represents the amount that an echo
signal has been reduced from Sin to Sout.

LMS Algorithm: Least Mean Squares algorithm. Common adaptive filtering technique.

N-LMS Algorithm: Normalized Least Mean Squares algorithm. A variation on standard 
LMS in which the tap-weight update term is scaled by the inverse of the input signal
power. 

Rank: The number of non-zero eigenvalues or singular values a matrix has. Full-rank 
matrices have a non-zero determinant, and are thus non-singular and invertible. 

RLS Algorithm: Recursive Least Squares algorithm. Common adaptive filtering 
technique. 

It will be appreciated by one skilled in the art that many variations of the invention are 
possible without departing from the scope of the appended claims.